Section: New Results

Graph-based Text Analytics

Participants: Fragkiskos Malliaros (in collaboration with Konstantinos Skianis and Michalis Vazirgiannis, École Polytechnique)

Text categorization is a core task in many text mining applications. In contrast to the traditional Bag-of-Words approach, we consider the Graph-of-Words model, in which each document is represented by a graph that encodes relationships between its terms. Under this formulation, term weighting becomes a node ranking problem: the importance of a term is determined by the importance of the corresponding node in the graph, measured with node centrality criteria. We have also introduced novel graph-based weighting schemes that enrich the graphs with word-embedding distances, in order to reward or penalize the importance of semantically close terms [39]. Our methods produce more discriminative feature weights for text categorization and outperform existing frequency-based criteria, which also highlights the importance of graph-based methods in text analytics and natural language processing in general.
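
To make the idea of term weighting as node ranking concrete, the following is a minimal sketch and not the implementation evaluated in [39]. It assumes an undirected, unweighted co-occurrence graph built with a sliding window, and it uses degree centrality as one possible node centrality criterion. The function names, the `window` parameter, and the sample sentence are hypothetical and chosen only for illustration; the word-embedding enrichment is not covered.

```python
from collections import defaultdict


def graph_of_words(tokens, window=3):
    """Build an undirected co-occurrence graph (illustrative assumption).

    Each distinct term is a node; two distinct terms are linked when they
    appear within `window` consecutive tokens of each other.
    """
    neighbors = defaultdict(set)
    for i, term in enumerate(tokens):
        neighbors[term]  # register the term as a node even if it has no edges
        for other in tokens[i + 1 : i + window]:
            if other != term:
                neighbors[term].add(other)
                neighbors[other].add(term)
    return neighbors


def degree_centrality_weights(tokens, window=3):
    """Weight each term by its normalized degree in the graph-of-words."""
    graph = graph_of_words(tokens, window)
    n = len(graph)
    if n <= 1:
        return {term: 0.0 for term in graph}
    return {term: len(adj) / (n - 1) for term, adj in graph.items()}


# Hypothetical example document, already tokenized by whitespace.
tokens = "graph based term weighting for text categorization with graph models".split()
weights = degree_centrality_weights(tokens, window=3)
for term, weight in sorted(weights.items(), key=lambda kv: -kv[1]):
    print(f"{term}: {weight:.3f}")
```

In this toy document, "graph" receives the highest weight because it co-occurs with the largest number of distinct terms. In a categorization pipeline, such centrality scores would take the place of the term-frequency component of the feature vector; other centrality criteria could be substituted in `degree_centrality_weights` without changing the graph construction.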